Search Results: "merge"

18 March 2024

Gunnar Wolf: After miniDebConf Santa Fe

Last week we held our promised miniDebConf in Santa Fe City, Santa Fe province, Argentina just across the river from Paran , where I have spent almost six beautiful months I will never forget. Around 500 Kilometers North from Buenos Aires, Santa Fe and Paran are separated by the beautiful and majestic Paran river, which flows from Brazil, marks the Eastern border of Paraguay, and continues within Argentina as the heart of the litoral region of the country, until it merges with the Uruguay river (you guessed right the river marking the Eastern border of Argentina, first with Brazil and then with Uruguay), and they become the R o de la Plata. This was a short miniDebConf: we were lent the APUL union s building for the weekend (thank you very much!); during Saturday, we had a cycle of talks, and on sunday we had more of a hacklab logic, having some unstructured time to work each on their own projects, and to talk and have a good time together. We were five Debian people attending: santiago debacle eamanu dererk gwolf @debian.org. My main contact to kickstart organization was Mart n Bayo. Mart n was for many years the leader of the Technical Degree on Free Software at Universidad Nacional del Litoral, where I was also a teacher for several years. Together with Leo Mart nez, also a teacher at the tecnicatura, they contacted us with Guillermo and Gabriela, from the APUL non-teaching-staff union of said university. We had the following set of talks (for which there is a promise to get electronic record, as APUL was kind enough to record them! of course, I will push them to our usual conference video archiving service as soon as I get them)
Hour Title (Spanish) Title (English) Presented by
10:00-10:25 Introducci n al Software Libre Introduction to Free Software Mart n Bayo
10:30-10:55 Debian y su comunidad Debian and its community Emanuel Arias
11:00-11:25 Por qu sigo contribuyendo a Debian despu s de 20 a os? Why am I still contributing to Debian after 20 years? Santiago Ruano
11:30-11:55 Mi identidad y el proyecto Debian: Qu es el llavero OpenPGP y por qu ? My identity and the Debian project: What is the OpenPGP keyring and why? Gunnar Wolf
12:00-13:00 Explorando las masculinidades en el contexto del Software Libre Exploring masculinities in the context of Free Software Gora Ortiz Fuentes - Jos Francisco Ferro
13:00-14:30 Lunch
14:30-14:55 Debian para el d a a d a Debian for our every day Leonardo Mart nez
15:00-15:25 Debian en las Raspberry Pi Debian in the Raspberry Pi Gunnar Wolf
15:30-15:55 Device Trees Device Trees Lisandro Dami n Nicanor Perez Meyer (videoconferencia)
16:00-16:25 Python en Debian Python in Debian Emmanuel Arias
16:30-16:55 Debian y XMPP en la medici n de viento para la energ a e lica Debian and XMPP for wind measuring for eolic energy Martin Borgert
As it always happens DebConf, miniDebConf and other Debian-related activities are always fun, always productive, always a great opportunity to meet again our decades-long friends. Lets see what comes next!

13 March 2024

Russell Coker: The Shape of Computers

Introduction There have been many experiments with the sizes of computers, some of which have stayed around and some have gone away. The trend has been to make computers smaller, the early computers had buildings for them. Recently for come classes computers have started becoming as small as could be reasonably desired. For example phones are thin enough that they can blow away in a strong breeze, smart watches are much the same size as the old fashioned watches they replace, and NUC type computers are as small as they need to be given the size of monitors etc that they connect to. This means that further development in the size and shape of computers will largely be determined by human factors. I think we need to consider how computers might be developed to better suit humans and how to write free software to make such computers usable without being constrained by corporate interests. Those of us who are involved in developing OSs and applications need to consider how to adjust to the changes and ideally anticipate changes. While we can t anticipate the details of future devices we can easily predict general trends such as being smaller, higher resolution, etc. Desktop/Laptop PCs When home computers first came out it was standard to have the keyboard in the main box, the Apple ][ being the most well known example. This has lost popularity due to the demand to have multiple options for a light keyboard that can be moved for convenience combined with multiple options for the box part. But it still pops up occasionally such as the Raspberry Pi 400 [1] which succeeds due to having the computer part being small and light. I think this type of computer will remain a niche product. It could be used in a add a screen to make a laptop as opposed to the add a keyboard to a tablet to make a laptop model but a tablet without a keyboard is more useful than a non-server PC without a display. The PC as box with connections for keyboard, display, etc has a long future ahead of it. But the sizes will probably decrease (they should have stopped making PC cases to fit CD/DVD drives at least 10 years ago). The NUC size is a useful option and I think that DVD drives will stop being used for software soon which will allow a range of smaller form factors. The regular laptop is something that will remain useful, but the tablet with detachable keyboard devices could take a lot of that market. Full functionality for all tasks requires a keyboard because at the moment text editing with a touch screen is an unsolved problem in computer science [2]. The Lenovo Thinkpad X1 Fold [3] and related Lenovo products are very interesting. Advances in materials allow laptops to be thinner and lighter which leaves the screen size as a major limitation to portability. There is a conflict between desiring a large screen to see lots of content and wanting a small size to carry and making a device foldable is an obvious solution that has recently become possible. Making a foldable laptop drives a desire for not having a permanently attached keyboard which then makes a touch screen keyboard a requirement. So this means that user interfaces for PCs have to be adapted to work well on touch screens. The Think line seems to be continuing the history of innovation that it had when owned by IBM. There are also a range of other laptops that have two regular screens so they are essentially the same as the Thinkpad X1 Fold but with two separate screens instead of one folding one, prices are as low as $600US. I think that the typical interfaces for desktop PCs (EG MS-Windows and KDE) don t work well for small devices and touch devices and the Android interface generally isn t a good match for desktop systems. We need to invent more options for this. This is not a criticism of KDE, I use it every day and it works well. But it s designed for use cases that don t match new hardware that is on sale. As an aside it would be nice if Lenovo gave samples of their newest gear to people who make significant contributions to GUIs. Give a few Thinkpad Fold devices to KDE people, a few to GNOME people, and a few others to people involved in Wayland development and see how that promotes software development and future sales. We also need to adopt features from laptops and phones into desktop PCs. When voice recognition software was first released in the 90s it was for desktop PCs, it didn t take off largely because it wasn t very accurate (none of them recognised my voice). Now voice recognition in phones is very accurate and it s very common for desktop PCs to have a webcam or headset with a microphone so it s time for this to be re-visited. GPS support in laptops is obviously useful and can work via Wifi location, via a USB GPS device, or via wwan mobile phone hardware (even if not used for wwan networking). Another possibility is using the same software interfaces as used for GPS on laptops for a static definition of location for a desktop PC or server. The Interesting New Things Watch Like The wrist-watch [4] has been a standard format for easy access to data when on the go since it s military use at the end of the 19th century when the practical benefits beat the supposed femininity of the watch. So it seems most likely that they will continue to be in widespread use in computerised form for the forseeable future. For comparison smart phones have been in widespread use as pocket watches for about 10 years. The question is how will watch computers end up? Will we have Dick Tracy style watch phones that you speak into? Will it be the current smart watch functionality of using the watch to answer a call which goes to a bluetooth headset? Will smart watches end up taking over the functionality of the calculator watch [5] which was popular in the 80 s? With today s technology you could easily have a fully capable PC strapped to your forearm, would that be useful? Phone Like Folding phones (originally popularised as Star Trek Tricorders) seem likely to have a long future ahead of them. Engineering technology has only recently developed to the stage of allowing them to work the way people would hope them to work (a folding screen with no gaps). Phones and tablets with multiple folds are coming out now [6]. This will allow phones to take much of the market share that tablets used to have while tablets and laptops merge at the high end. I ve previously written about Convergence between phones and desktop computers [7], the increased capabilities of phones adds to the case for Convergence. Folding phones also provide new possibilities for the OS. The Oppo OnePlus Open and the Google Pixel Fold both have a UI based around using the two halves of the folding screen for separate data at some times. I think that the current user interfaces for desktop PCs don t properly take advantage of multiple monitors and the possibilities raised by folding phones only adds to the lack. My pet peeve with multiple monitor setups is when they don t make it obvious which monitor has keyboard focus so you send a CTRL-W or ALT-F4 to the wrong screen by mistake, it s a problem that also happens on a single screen but is worse with multiple screens. There are rumours of phones described as three fold (where three means the number of segments with two folds between them), it will be interesting to see how that goes. Will phones go the same way as PCs in terms of having a separation between the compute bit and the input device? It s quite possible to have a compute device in the phone form factor inside a secure pocket which talks via Bluetooth to another device with a display and speakers. Then you could change your phone between a phone-size display and a tablet sized display easily and when using your phone a thief would not be able to easily steal the compute bit (which has passwords etc). Could the watch part of the phone (strapped to your wrist and difficult to steal) be the active part and have a tablet size device as an external display? There are already announcements of smart watches with up to 1GB of RAM (same as the Samsung Galaxy S3), that s enough for a lot of phone functionality. The Rabbit R1 [8] and the Humane AI Pin [9] have some interesting possibilities for AI speech interfaces. Could that take over some of the current phone use? It seems that visually impaired people have been doing badly in the trend towards touch screen phones so an option of a voice interface phone would be a good option for them. As an aside I hope some people are working on AI stuff for FOSS devices. Laptop Like One interesting PC variant I just discovered is the Higole 2 Pro portable battery operated Windows PC with 5.5 touch screen [10]. It looks too thick to fit in the same pockets as current phones but is still very portable. The version with built in battery is $AU423 which is in the usual price range for low end laptops and tablets. I don t think this is the future of computing, but it is something that is usable today while we wait for foldable devices to take over. The recent release of the Apple Vision Pro [11] has driven interest in 3D and head mounted computers. I think this could be a useful peripheral for a laptop or phone but it won t be part of a primary computing environment. In 2011 I wrote about the possibility of using augmented reality technology for providing a desktop computing environment [12]. I wonder how a Vision Pro would work for that on a train or passenger jet. Another interesting thing that s on offer is a laptop with 7 touch screen beside the keyboard [13]. It seems that someone just looked at what parts are available cheaply in China (due to being parts of more popular devices) and what could fit together. I think a keyboard should be central to the monitor for serious typing, but there may be useful corner cases where typing isn t that common and a touch-screen display is of use. Developing a range of strange hardware and then seeing which ones get adopted is a good thing and an advantage of Ali Express and Temu. Useful Hardware for Developing These Things I recently bought a second hand Thinkpad X1 Yoga Gen3 for $359 which has stylus support [14], and it s generally a great little laptop in every other way. There s a common failure case of that model where touch support for fingers breaks but the stylus still works which allows it to be used for testing touch screen functionality while making it cheap. The PineTime is a nice smart watch from Pine64 which is designed to be open [15]. I am quite happy with it but haven t done much with it yet (apart from wearing it every day and getting alerts etc from Android). At $50 when delivered to Australia it s significantly more expensive than most smart watches with similar features but still a lot cheaper than the high end ones. Also the Raspberry Pi Watch [16] is interesting too. The PinePhonePro is an OK phone made to open standards but it s hardware isn t as good as Android phones released in the same year [17]. I ve got some useful stuff done on mine, but the battery life is a major issue and the screen resolution is low. The Librem 5 phone from Purism has a better hardware design for security with switches to disable functionality [18], but it s even slower than the PinePhonePro. These are good devices for test and development but not ones that many people would be excited to use every day. Wwan hardware (for accessing the phone network) in M.2 form factor can be obtained for free if you have access to old/broken laptops. Such devices start at about $35 if you want to buy one. USB GPS devices also start at about $35 so probably not worth getting if you can get a wwan device that does GPS as well. What We Must Do Debian appears to have some voice input software in the pocketsphinx package but no documentation on how it s to be used. This would be a good thing to document, I spent 15 mins looking at it and couldn t get it going. To take advantage of the hardware features in phones we need software support and we ideally don t want free software to lag too far behind proprietary software which IMHO means the typical Android setup for phones/tablets. Support for changing screen resolution is already there as is support for touch screens. Support for adapting the GUI to changed screen size is something that needs to be done even today s hardware of connecting a small laptop to an external monitor doesn t have the ideal functionality for changing the UI. There also seem to be some limitations in touch screen support with multiple screens, I haven t investigated this properly yet, it definitely doesn t work in an expected manner in Ubuntu 22.04 and I haven t yet tested the combinations on Debian/Unstable. ML is becoming a big thing and it has some interesting use cases for small devices where a smart device can compensate for limited input options. There s a lot of work that needs to be done in this area and we are limited by the fact that we can t just rip off the work of other people for use as training data in the way that corporations do. Security is more important for devices that are at high risk of theft. The vast majority of free software installations are way behind Android in terms of security and we need to address that. I have some ideas for improvement but there is always a conflict between security and usability and while Android is usable for it s own special apps it s not usable in a I want to run applications that use any files from any other applicationsin any way I want sense. My post about Sandboxing Phone apps is relevant for people who are interested in this [19]. We also need to extend security models to cope with things like ok google type functionality which has the potential to be a bug and the emerging class of LLM based attacks. I will write more posts about these thing. Please write comments mentioning FOSS hardware and software projects that address these issues and also documentation for such things.

Freexian Collaborators: Debian Contributions: Upcoming Improvements to Salsa CI, /usr-move, packaging simplemonitor, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

/usr-move, by Helmut Grohne Much of the work was spent on handling interaction with time time64 transition and sending patches for mitigating fallout. The set of packages relevant to debootstrap is mostly converted and the patches for glibc and base-files have been refined due to feedback from the upload to Ubuntu noble. Beyond this, he sent patches for all remaining packages that cannot move their files with dh-sequence-movetousr and packages using dpkg-divert in ways that dumat would not recognize.

Upcoming improvements to Salsa CI, by Santiago Ruano Rinc n Last month, Santiago Ruano Rinc n started the work on integrating sbuild into the Salsa CI pipeline. Initially, Santiago used sbuild with the unshare chroot mode. However, after discussion with josch, jochensp and helmut (thanks to them!), it turns out that the unshare mode is not the most suitable for the pipeline, since the level of isolation it provides is not needed, and some test suites would fail (eg: krb5). Additionally, one of the requirements of the build job is the use of ccache, since it is needed by some C/C++ large projects to reduce the compilation time. In the preliminary work with unshare last month, it was not possible to make ccache to work. Finally, Santiago changed the chroot mode, and now has a couple of POC (cf: 1 and 2) that rely on the schroot and sudo, respectively. And the good news is that ccache is successfully used by sbuild with schroot! The image here comes from an example of building grep. At the end of the build, ccache -s shows the statistics of the cache that it used, and so a little more than half of the calls of that job were cacheable. The most important pieces are in place to finish the integration of sbuild into the pipeline. Other than that, Santiago also reviewed the very useful merge request !346, made by IOhannes zm lnig to autodetect the release from debian/changelog. As agreed with IOhannes, Santiago is preparing a merge request to include the release autodetection use case in the very own Salsa CI s CI.

Packaging simplemonitor, by Carles Pina i Estany Carles started using simplemonitor in 2017, opened a WNPP bug in 2022 and started packaging simplemonitor dependencies in October 2023. After packaging five direct and indirect dependencies, Carles finally uploaded simplemonitor to unstable in February. During the packaging of simplemonitor, Carles reported a few issues to upstream. Some of these were to make the simplemonitor package build and run tests reproducibly. A reproducibility issue was reprotest overriding the timezone, which broke simplemonitor s tests. There have been discussions on resolving this upstream in simplemonitor and in reprotest, too. Carles also started upgrading or improving some of simplemonitor s dependencies.

Miscellaneous contributions
  • Stefano Rivera spent some time doing admin on debian.social infrastructure. Including dealing with a spike of abuse on the Jitsi server.
  • Stefano started to prepare a new release of dh-python, including cleaning out a lot of old Python 2.x related code. Thanks to Niels Thykier (outside Freexian) for spear-heading this work.
  • DebConf 24 planning is beginning. Stefano discussed venues and finances with the local team and remotely supported a site-visit by Nattie (outside Freexian).
  • Also in the DebConf 24 context, Santiago took part in discussions and preparations related to the Content Team.
  • A JIT bug was reported against pypy3 in Debian Bookworm. Stefano bisected the upstream history to find the patch (it was already resolved upstream) and released an update to pypy3 in bookworm.
  • Enrico participated in /usr-merge discussions with Helmut.
  • Colin Watson backported a python-channels-redis fix to bookworm, rediscovered while working on debusine.
  • Colin dug into a cluster of celery build failures and tracked the hardest bit down to a Python 3.12 regression, now fixed in unstable. celery should be back in testing once the 64-bit time_t migration is out of the way.
  • Thorsten Alteholz uploaded a new upstream version of cpdb-libs. Unfortunately upstream changed the naming of their release tags, so updating the watch file was a bit demanding. Anyway this version 2.0 is a huge step towards introduction of the new Common Print Dialog Backends.
  • Helmut send patches for 48 cross build failures.
  • Helmut changed debvm to use mkfs.ext4 instead of genext2fs.
  • Helmut sent a debci MR for improving collector robustness.
  • In preparation for DebConf 25, Santiago worked on the Brest Bid.

9 March 2024

Reproducible Builds: Reproducible Builds in February 2024

Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.

Reproducible Builds at FOSDEM 2024 Core Reproducible Builds developer Holger Levsen presented at the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. However, that wasn t the only talk related to Reproducible Builds. However, please see our comprehensive FOSDEM 2024 news post for the full details and links.

Maintainer Perspectives on Open Source Software Security Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation sports an infographic which mentions that 56% of [polled] projects support reproducible builds .

Mailing list highlights From our mailing list this month:

Distribution work In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed this month adding to Debian s knowledge about identified issues. A number of issue types were updated as well. [ ][ ][ ][ ] In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in-depth to a query about a previous report
Fedora developer Zbigniew J drzejewski-Szmek announced a work-in-progress script called fedora-repro-build that attempts to reproduce an existing package within a koji build environment. Although the projects README file lists a number of fields will always or almost always vary and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files leftover by the Sphinx documentation generator which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.
Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work elsewhere in openSUSE.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 256, 257 and 258 to Debian and made the following additional changes:
  • Use a deterministic name instead of trusting gpg s use-embedded-filenames. Many thanks to Daniel Kahn Gillmor dkg@debian.org for reporting this issue and providing feedback. [ ][ ]
  • Don t error-out with a traceback if we encounter struct.unpack-related errors when parsing Python .pyc files. (#1064973). [ ]
  • Don t try and compare rdb_expected_diff on non-GNU systems as %p formatting can vary, especially with respect to MacOS. [ ]
  • Fix compatibility with pytest 8.0. [ ]
  • Temporarily fix support for Python 3.11.8. [ ]
  • Use the 7zip package (over p7zip-full) after a Debian package transition. (#1063559). [ ]
  • Bump the minimum Black source code reformatter requirement to 24.1.1+. [ ]
  • Expand an older changelog entry with a CVE reference. [ ]
  • Make test_zip black clean. [ ]
In addition, James Addison contributed a patch to parse the headers from the diff(1) correctly [ ][ ] thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to version 255, 256, and 258, and updated trydiffoscope to 67.0.6.

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:
  • Create a (working) proof of concept for enabling a specific number of CPUs. [ ][ ]
  • Consistently use 398 days for time variation rather than choosing randomly and update README.rst to match. [ ][ ]
  • Support a new --vary=build_path.path option. [ ][ ][ ][ ]

Website updates There were made a number of improvements to our website this month, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken. [ ][ ]
    • Use the 64-bit amd64 kernel on all i386 nodes; no more 686 PAE kernels. [ ]
    • Add an Erlang package set. [ ]
  • Other changes:
    • Grant Jan-Benedict Glaw shell access to the Jenkins node. [ ]
    • Enable debugging for NetBSD reproducibility testing. [ ]
    • Use /usr/bin/du --apparent-size in the Jenkins shell monitor. [ ]
    • Revert reproducible nodes: mark osuosl2 as down . [ ]
    • Thanks again to Codethink, for they have doubled the RAM on our arm64 nodes. [ ]
    • Only set /proc/$pid/oom_score_adj to -1000 if it has not already been done. [ ]
    • Add the opemwrt-target-tegra and jtx task to the list of zombie jobs. [ ][ ]
Vagrant Cascadian also made the following changes:
  • Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm. [ ][ ][ ]
  • Add two new armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [ ][ ] [ ][ ]
In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the tegra target with mpc85xx [ ], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out of space issues on a tmpfs-backed /tmp [ ] and Zheng Junjie added a link to the GNU Guix tests [ ]. Lastly, node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

7 March 2024

Gunnar Wolf: Constructed truths truth and knowledge in a post-truth world

This post is a review for Computing Reviews for Constructed truths truth and knowledge in a post-truth world , a book published in Springer Link
Many of us grew up used to having some news sources we could implicitly trust, such as well-positioned newspapers and radio or TV news programs. We knew they would only hire responsible journalists rather than risk diluting public trust and losing their brand s value. However, with the advent of the Internet and social media, we are witnessing what has been termed the post-truth phenomenon. The undeniable freedom that horizontal communication has given us automatically brings with it the emergence of filter bubbles and echo chambers, and truth seems to become a group belief. Contrary to my original expectations, the core topic of the book is not about how current-day media brings about post-truth mindsets. Instead it goes into a much deeper philosophical debate: What is truth? Does truth exist by itself, objectively, or is it a social construct? If activists with different political leanings debate a given subject, is it even possible for them to understand the same points for debate, or do they truly experience parallel realities? The author wrote this book clearly prompted by the unprecedented events that took place in 2020, as the COVID-19 crisis forced humanity into isolation and online communication. Donald Trump is explicitly and repeatedly presented throughout the book as an example of an actor that took advantage of the distortions caused by post-truth. The first chapter frames the narrative from the perspective of information flow over the last several decades, on how the emergence of horizontal, uncensored communication free of editorial oversight started empowering the netizens and created a temporary information flow utopia. But soon afterwards, algorithmic gatekeepers started appearing, creating a set of personalized distortions on reality; users started getting news aligned to what they already showed interest in. This led to an increase in polarization and the growth of narrative-framing-specific communities that served as echo chambers for disjoint views on reality. This led to the growth of conspiracy theories and, necessarily, to the science denial and pseudoscience that reached unimaginable peaks during the COVID-19 crisis. Finally, when readers decide based on completely subjective criteria whether a scientific theory such as global warming is true or propaganda, or question what most traditional news outlets present as facts, we face the phenomenon known as fake news. Fake news leads to post-truth, a state where it is impossible to distinguish between truth and falsehood, and serves only a rhetorical function, making rational discourse impossible. Toward the end of the first chapter, the tone of writing quickly turns away from describing developments in the spread of news and facts over the last decades and quickly goes deep into philosophy, into the very thorny subject pursued by said discipline for millennia: How can truth be defined? Can different perspectives bring about different truth values for any given idea? Does truth depend on the observer, on their knowledge of facts, on their moral compass or in their honest opinions? Zoglauer dives into epistemology, following various thinkers ideas on what can be understood as truth: constructivism (whether knowledge and truth values can be learnt by an individual building from their personal experience), objectivity (whether experiences, and thus truth, are universal, or whether they are naturally individual), and whether we can proclaim something to be true when it corresponds to reality. For the final chapter, he dives into the role information and knowledge play in assigning and understanding truth value, as well as the value of second-hand knowledge: Do we really own knowledge because we can look up facts online (even if we carefully check the sources)? Can I, without any medical training, diagnose a sickness and treatment by honestly and carefully looking up its symptoms in medical databases? Wrapping up, while I very much enjoyed reading this book, I must confess it is completely different from what I expected. This book digs much more into the abstract than into information flow in modern society, or the impact on early 2020s politics as its editorial description suggests. At 160 pages, the book is not a heavy read, and Zoglauer s writing style is easy to follow, even across the potentially very deep topics it presents. Its main readership is not necessarily computing practitioners or academics. However, for people trying to better understand epistemology through its expressions in the modern world, it will be a very worthy read.

4 March 2024

Colin Watson: Free software activity in January/February 2024

Two months into my new gig and it s going great! Tracking my time has taken a bit of getting used to, but having something that amounts to a queryable database of everything I ve done has also allowed some helpful introspection. Freexian sponsors up to 20% of my time on Debian tasks of my choice. In fact I ve been spending the bulk of my time on debusine which is itself intended to accelerate work on Debian, but more details on that later. While I contribute to Freexian s summaries now, I ve also decided to start writing monthly posts about my free software activity as many others do, to get into some more detail. January 2024 February 2024

3 March 2024

Paul Wise: FLOSS Activities Feb 2024

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Spam: reported 1 Debian bug report
  • Debian BTS usertags: changes for the month

Administration
  • Debian BTS: unarchive/reopen/triage bugs for reintroduced packages: ovito, tahoe-lafs, tpm2-tss-engine
  • Debian wiki: produce HTML dump for a user, unblock IP addresses, approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The SWH work was sponsored. All other work was done on a volunteer basis.

1 March 2024

Guido G nther: Free Software Activities February 2024

A short status update what happened last month. Work in progress is marked as WiP: GNOME Calls Phosh and Phoc As this often overlaps I've put them in a common section: Phosh Tour Phosh Mobile Settings Phosh OSK Stub Livi Video Player Phosh.mobi Website If you want to support my work see donations.

28 February 2024

Dirk Eddelbuettel: RcppEigen 0.3.4.0.0 on CRAN: New Upstream, At Last

We are thrilled to share that RcppEigen has now upgraded to Eigen release 3.4.0! The new release 0.3.4.0.0 arrived on CRAN earlier today, and has been shipped to Debian as well. Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. This update has been in the works for a full two and a half years! It all started with a PR #102 by Yixuan bringing the package-local changes for R integration forward to usptream release 3.4.0. We opened issue #103 to steer possible changes from reverse-dependency checking through. Lo and behold, this just stalled because a few substantial changes were needed and not coming. But after a long wait, and like a bolt out of a perfectly blue sky, Andrew revived it in January with a reverse depends run of his own along with a set of PRs. That was the push that was needed, and I steered it along with a number of reverse dependency checks, and occassional emails to maintainers. We managed to bring it down to only three packages having a hickup, and all three had received PRs thanks to Andrew and even merged them. So the plan became to release today following a final fourteen day window. And CRAN was convinced by our arguments that we followed due process. So there it is! Big big thanks to all who helped it along, especially Yixuan and Andrew but also Mikael who updated another patch set he had prepared for the previous release series. The complete NEWS file entry follows.

Changes in RcppEigen version 0.3.4.0.0 (2024-02-28)
  • The Eigen version has been upgrade to release 3.4.0 (Yixuan)
  • Extensive reverse-dependency checks ensure only three out of over 400 packages at CRAN are affected; PRs and patches helped other packages
  • The long-running branch also contains substantial contributions from Mikael Jagan (for the lme4 interface) and Andrew Johnson (revdep PRs)

Courtesy of CRANberries, there is also a diffstat report for the most recent release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

24 February 2024

Niels Thykier: Language Server for Debian: Spellchecking

This is my third update on writing a language server for Debian packaging files, which aims at providing a better developer experience for Debian packagers. Lets go over what have done since the last report.
Semantic token support I have added support for what the Language Server Protocol (LSP) call semantic tokens. These are used to provide the editor insights into tokens of interest for users. Allegedly, this is what editors would use for syntax highlighting as well. Unfortunately, eglot (emacs) does not support semantic tokens, so I was not able to test this. There is a 3-year old PR for supporting with the last update being ~3 month basically saying "Please sign the Copyright Assignment". I pinged the GitHub issue in the hopes it will get unstuck. For good measure, I also checked if I could try it via neovim. Before installing, I read the neovim docs, which helpfully listed the features supported. Sadly, I did not spot semantic tokens among those and parked from there. That was a bit of a bummer, but I left the feature in for now. If you have an LSP capable editor that supports semantic tokens, let me know how it works for you! :)
Spellchecking Finally, I implemented something Otto was missing! :) This stared with Paul Wise reminding me that there were Python binding for the hunspell spellchecker. This enabled me to get started with a quick prototype that spellchecked the Description fields in debian/control. I also added spellchecking of comments while I was add it. The spellchecker runs with the standard en_US dictionary from hunspell-en-us, which does not have a lot of technical terms in it. Much less any of the Debian specific slang. I spend considerable time providing a "built-in" wordlist for technical and Debian specific slang to overcome this. I also made a "wordlist" for known Debian people that the spellchecker did not recognise. Said wordlist is fairly short as a proof of concept, and I fully expect it to be community maintained if the language server becomes a success. My second problem was performance. As I had suspected that spellchecking was not the fastest thing in the world. Therefore, I added a very small language server for the debian/changelog, which only supports spellchecking the textual part. Even for a small changelog of a 1000 lines, the spellchecking takes about 5 seconds, which confirmed my suspicion. With every change you do, the existing diagnostics hangs around for 5 seconds before being updated. Notably, in emacs, it seems that diagnostics gets translated into an absolute character offset, so all diagnostics after the change gets misplaced for every character you type. Now, there is little I could do to speed up hunspell. But I can, as always, cheat. The way diagnostics work in the LSP is that the server listens to a set of notifications like "document opened" or "document changed". In a response to that, the LSP can start its diagnostics scanning of the document and eventually publish all the diagnostics to the editor. The spec is quite clear that the server owns the diagnostics and the diagnostics are sent as a "notification" (that is, fire-and-forgot). Accordingly, there is nothing that prevents the server from publishing diagnostics multiple times for a single trigger. The only requirement is that the server publishes the accumulated diagnostics in every publish (that is, no delta updating). Leveraging this, I had the language server for debian/changelog scan the document and publish once for approximately every 25 typos (diagnostics) spotted. This means you quickly get your first result and that clears the obsolete diagnostics. Thereafter, you get frequent updates to the remainder of the document if you do not perform any further changes. That is, up to a predefined max of typos, so we do not overload the client for longer changelogs. If you do any changes, it resets and starts over. The only bit missing was dealing with concurrency. By default, a pygls language server is single threaded. It is not great if the language server hangs for 5 seconds everytime you type anything. Fortunately, pygls has builtin support for asyncio and threaded handlers. For now, I did an async handler that await after each line and setup some manual detection to stop an obsolete diagnostics run. This means the server will fairly quickly abandon an obsolete run. Also, as a side-effect of working on the spellchecking, I fixed multiple typos in the changelog of debputy. :)
Follow up on the "What next?" from my previous update In my previous update, I mentioned I had to finish up my python-debian changes to support getting the location of a token in a deb822 file. That was done, the MR is now filed, and is pending review. Hopefully, it will be merged and uploaded soon. :) I also submitted my proposal for a different way of handling relationship substvars to debian-devel. So far, it seems to have received only positive feedback. I hope it stays that way and we will have this feature soon. Guillem proposed to move some of this into dpkg, which might delay my plans a bit. However, it might be for the better in the long run, so I will wait a bit to see what happens on that front. :) As noted above, I managed to add debian/changelog as a support format for the language server. Even if it only does spellchecking and trimming of trailing newlines on save, it technically is a new format and therefore cross that item off my list. :D Unfortunately, I did not manage to write a linter variant that does not involve using an LSP-capable editor. So that is still pending. Instead, I submitted an MR against elpa-dpkg-dev-el to have it recognize all the fields that the debian/control LSP knows about at this time to offset the lack of semantic token support in eglot.
From here... My sprinting on this topic will soon come to an end, so I have to a bit more careful now with what tasks I open! I think I will narrow my focus to providing a batch linting interface. Ideally, with an auto-fix for some of the more mechanical issues, where this is little doubt about the answer. Additionally, I think the spellchecking will need a bit more maturing. My current code still trips on naming patterns that are "clearly" verbatim or code references like things written in CamelCase or SCREAMING_SNAKE_CASE. That gets annoying really quickly. It also trips on a lot of commands like dpkg-gencontrol, but that is harder to fix since it could have been a real word. I think those will have to be fixed people using quotes around the commands. Maybe the most popular ones will end up in the wordlist. Beyond that, I will play it by ear if I have any time left. :)

23 February 2024

Scarlett Gately Moore: Kubuntu: Week 3 wrap up, Contest! KDE snaps, Debian uploads.

Witch Wells AZ SunsetWitch Wells AZ Sunset
It has been a very busy 3 weeks here in Kubuntu! Kubuntu 22.04.4 LTS has been released and can be downloaded from here: https://kubuntu.org/getkubuntu/ Work done for the upcoming 24.04 LTS release: We have a branding contest! Please do enter, there are some exciting prizes https://kubuntu.org/news/kubuntu-graphic-design-contest/ Debian: I have uploaded to NEW the following packages: I am currently working on: KDE Snaps: KDE applications 23.08.5 have been uploaded to Candidate channel, testing help welcome. https://snapcraft.io/search?q=KDE I have also working on bug fixes, time allowing. My continued employment depends on you, please consider a donation! https://kubuntu.org/donate/ Thank you for stopping by! ~Scarlett

20 February 2024

Niels Thykier: Language Server (LSP) support for debian/control

About a month ago, Otto Kek l inen asked for editor extensions for debian related files on the debian-devel mailing list. In that thread, I concluded that what we were missing was a "Language Server" (LSP) for our packaging files. Last week, I started a prototype for such a LSP for the debian/control file as a starting point based on the pygls library. The initial prototype worked and I could do very basic diagnostics plus completion suggestion for field names.
Current features I got 4 basic features implemented, though I have only been able to test two of them in emacs.
  • Diagnostics or linting of basic issues.
  • Completion suggestions for all known field names that I could think of and values for some fields.
  • Folding ranges (untested). This feature enables the editor to "fold" multiple lines. It is often used with multi-line comments and that is the feature currently supported.
  • On save, trim trailing whitespace at the end of lines (untested). Might not be registered correctly on the server end.
Despite its very limited feature set, I feel editing debian/control in emacs is now a much more pleasant experience. Coming back to the features that Otto requested, the above covers a grand total of zero. Sorry, Otto. It is not you, it is me.
Completion suggestions For completion, all known fields are completed. Place the cursor at the start of the line or in a partially written out field name and trigger the completion in your editor. In my case, I can type R-R-R and trigger the completion and the editor will automatically replace it with Rules-Requires-Root as the only applicable match. Your milage may vary since I delegate most of the filtering to the editor, meaning the editor has the final say about whether your input matches anything. The only filtering done on the server side is that the server prunes out fields already used in the paragraph, so you are not presented with the option to repeat an already used field, which would be an error. Admittedly, not an error the language server detects at the moment, but other tools will. When completing field, if the field only has one non-default value such as Essential which can be either no (the default, but you should not use it) or yes, then the completion suggestion will complete the field along with its value. This is mostly only applicable for "yes/no" fields such as Essential and Protected. But it does also trigger for Package-Type at the moment. As for completing values, here the language server can complete the value for simple fields such as "yes/no" fields, Multi-Arch, Package-Type and Priority. I intend to add support for Section as well - maybe also Architecture.
Diagnostics On the diagnostic front, I have added multiple diagnostics:
  • An error marker for syntax errors.
  • An error marker for missing a mandatory field like Package or Architecture. This also includes Standards-Version, which is admittedly mandatory by policy rather than tooling falling part.
  • An error marker for adding Multi-Arch: same to an Architecture: all package.
  • Error marker for providing an unknown value to a field with a set of known values. As an example, writing foo in Multi-Arch would trigger this one.
  • Warning marker for using deprecated fields such as DM-Upload-Allowed, or when setting a field to its default value for fields like Essential. The latter rule only applies to selected fields and notably Multi-Arch: no does not trigger a warning.
  • Info level marker if a field like Priority duplicates the value of the Source paragraph.
Notable omission at this time:
  • No errors are raised if a field does not have a value.
  • No errors are raised if a field is duplicated inside a paragraph.
  • No errors are used if a field is used in the wrong paragraph.
  • No spellchecking of the Description field.
  • No understanding that Foo and X[CBS]-Foo are related. As an example, XC-Package-Type is completely ignored despite being the old name for Package-Type.
  • Quick fixes to solve these problems... :)
Trying it out If you want to try, it is sadly a bit more involved due to things not being uploaded or merged yet. Also, be advised that I will regularly rebase my git branches as I revise the code. The setup:
  • Build and install the deb of the main branch of pygls from https://salsa.debian.org/debian/pygls The package is in NEW and hopefully this step will soon just be a regular apt install.
  • Build and install the deb of the rts-locatable branch of my python-debian fork from https://salsa.debian.org/nthykier/python-debian There is a draft MR of it as well on the main repo.
  • Build and install the deb of the lsp-support branch of debputy from https://salsa.debian.org/debian/debputy
  • Configure your editor to run debputy lsp debian/control as the language server for debian/control. This is depends on your editor. I figured out how to do it for emacs (see below). I also found a guide for neovim at https://neovim.io/doc/user/lsp. Note that debputy can be run from any directory here. The debian/control is a reference to the file format and not a concrete file in this case.
Obviously, the setup should get easier over time. The first three bullet points should eventually get resolved by merges and upload meaning you end up with an apt install command instead of them. For the editor part, I would obviously love it if we can add snippets for editors to make the automatically pick up the language server when the relevant file is installed.
Using the debputy LSP in emacs The guide I found so far relies on eglot. The guide below assumes you have the elpa-dpkg-dev-el package installed for the debian-control-mode. Though it should be a trivially matter to replace debian-control-mode with a different mode if you use a different mode for your debian/control file. In your emacs init file (such as ~/.emacs or ~/.emacs.d/init.el), you add the follow blob.
(with-eval-after-load 'eglot
    (add-to-list 'eglot-server-programs
        '(debian-control-mode . ("debputy" "lsp" "debian/control"))))
Once you open the debian/control file in emacs, you can type M-x eglot to activate the language server. Not sure why that manual step is needed and if someone knows how to automate it such that eglot activates automatically on opening debian/control, please let me know. For testing completions, I often have to manually activate them (with C-M-i or M-x complete-symbol). Though, it is a bit unclear to me whether this is an emacs setting that I have not toggled or something I need to do on the language server side.
From here As next steps, I will probably look into fixing some of the "known missing" items under diagnostics. The quick fix would be a considerable improvement to assisting users. In the not so distant future, I will probably start to look at supporting other files such as debian/changelog or look into supporting configuration, so I can cover formatting features like wrap-and-sort. I am also very much open to how we can provide integrations for this feature into editors by default. I will probably create a separate binary package for specifically this feature that pulls all relevant dependencies that would be able to provide editor integrations as well.

13 February 2024

Arturo Borrero Gonz lez: Back to the Wikimedia Foundation!

Wikimedia Foundation logo In October 2023, I departed from the Wikimedia Foundation, the non-profit organization behind well-known projects like Wikipedia and others, to join Spryker. However, in January 2024 Spryker conducted a round of layoffs reportedly due to budget and business reasons. I was among those affected, being let go just three months after joining the company. Fortunately, the Wikimedia Cloud Services team, where I previously worked, was still seeking to backfill my position, so I reached out to them. They graciously welcomed me back as a Senior Site Reliability Engineer, in the same team and position as before. Although this three-month career detour wasn t the outcome I initially envisioned, I found it to be a valuable experience. During this time, I gained knowledge in a new tech stack, based on AWS, and discovered new engineering methodologies. Additionally, I had the opportunity to meet some wonderful individuals. I believe I have emerged stronger from this experience. Returning to the Wikimedia Foundation is truly motivating. It feels privileged to be part of this mature organization, its community, and movement, with its inspiring mission and values. In addition, I m hoping that this also means I can once again dedicate a bit more attention to my FLOSS activities, such as my duties within the Debian project. My email address is back online: aborrero@wikimedia.org. You can find me again in the IRC libera.chat server, in the usual wikimedia channels, nick arturo.

11 February 2024

Freexian Collaborators: Debian Contributions: Upcoming Improvements to Salsa CI, /usr-move, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Upcoming Improvements to Salsa CI, by Santiago Ruano Rinc n Santiago started picking up the work made by Outreachy Intern, Enock Kashada (a big thanks to him!), to solve some long-standing issues in Salsa CI. Currently, the first job in a Salsa CI pipeline is the extract-source job, used to produce a debianize source tree of the project. This job was introduced to make it possible to build the projects on different architectures, on the subsequent build jobs. However, that extract-source approach is sub-optimal: not only it increases the execution time of the pipeline by some minutes, but also projects whose source tree is too large are not able to use the pipeline. The debianize source tree is passed as an artifact to the build jobs, and for those large projects, the size of their source tree exceeds the Salsa s limits. This is specific issue is documented as issue #195, and the proposed solution is to get rid of the extract-source job, relying on sbuild in the very build job (see issue #296). Switching to sbuild would also help to improve the build source job, solving issues such as #187 and #298. The current work-in-progress is very preliminary, but it has already been possible to run the build (amd64), build-i386 and build-source job using sbuild with the unshare mode. The image on the right shows a pipeline that builds grep. All the test jobs use the artifacts of the new build job. There is a lot of remaining work, mainly making the integration with ccache work. This change could break some things, it will also be important to test how the new pipeline works with complex projects. Also, thanks to Emmanuel Arias, we are proposing a Google Summer of Code 2024 project to improve Salsa CI. As part of the ongoing work in preparation for the GSoC 2024 project, Santiago has proposed a merge request to make more efficient how contributors can test their changes on the Salsa CI pipeline.

/usr-move, by Helmut Grohne In January, we sent most of the moving patches for the set of packages involved with debootstrap. Notably missing is glibc, which turns out harder than anticipated via dumat, because it has Conflicts between different architectures, which dumat does not analyze. Patches for diversion mitigations have been updated in a way to not exhibit any loss anymore. The main change here is that packages which are being diverted now support the diverting packages in transitioning their diversions. We also supported a few packages with non-trivial changes such as netplan.io. dumat has been enhanced to better support derivatives such as Ubuntu.

Miscellaneous contributions
  • Python 3.12 migration trundles on. Stefano Rivera helped port several new packages to support 3.12.
  • Stefano updated the Sphinx configuration of DebConf Video Team s documentation, which was broken by Sphinx 7.
  • Stefano published the videos from the Cambridge MiniDebConf to YouTube and PeerTube.
  • DebConf 24 planning has begun, and Stefano & Utkarsh have started work on this.
  • Utkarsh re-sponsored the upload of golang-github-prometheus-community-pgbouncer-exporter for Lena.
  • Colin Watson added Incus support to autopkgtest.
  • Colin discovered Perl::Critic and used it to tidy up some poor practices in several of his packages, including debconf.
  • Colin did some overdue debconf maintenance, mainly around tidying up error message handling in several places (1, 2, 3).
  • Colin figured out how to update the mirror size documentation in debmirror, last updated in 2010. It should now be much easier to keep it up to date regularly.
  • Colin issued a man-db buster update to clean up some irritations due to strict sandboxing.
  • Thorsten Alteholz adopted two more packages, magicfilter and ifhp, for the debian-printing team. Those packages are the last ones of the latest round of adoptions to preserve the old printing protocol within Debian. If you know of other packages that should be retained, please don t hesitate to contact Thorsten.
  • Enrico participated in /usr-merge discussions with Helmut.
  • Helmut sent patches for 16 cross build failures.
  • Helmut supported Matthias Klose (not affiliated with Freexian) with adding -for-host support to gcc-defaults.
  • Helmut uploaded dput-ng enabling dcut migrate and merging two MRs of Ben Hutchings.
  • Santiago took part in the discussions relating to the EU Cyber Resilience Act (CRA) and the Debian public statement that was published last year. He participated in a meeting with Members of the European Parliament (MEPs), Marcel Kolaja and Karen Melchior, and their teams to clarify some points about the impact of the CRA and Debian and downstream projects, and the improvements in the last version of the proposed regulation.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform used by titans like Google, Meta, Boeing, and Lockheed Martin :
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch s use of self-hosted runners as well as submitting a pull request to address a trivial typo in the project s README file to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval d or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found on their announcement message, as well as on the tool s homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code? Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I m not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.

Distribution updates In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) . Additionally 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Community updates There were made a number of improvements to our website, including Bernhard M. Wiedemann fixing a number of typos of the term nondeterministic . [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds. [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 254 and 255 to Debian but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well considerable work from Vekhir in order to fix compatibility between various and subtle incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don t mark a job as failed if process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install the libpam0g-dev which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it s January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel s OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance was performed by Holger Levsen (eg. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (eg. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches The Reproducible Builds project tries to detects, dissects and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

2 February 2024

Ben Hutchings: FOSS activity in January 2024

Ben Hutchings: FOSS activity in December 2023

1 February 2024

Paul Wise: FLOSS Activities January 2024

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes
  • OpenStreetMap: fixed a bunch of broken website URLs
  • Debian pass-otp: oathtool safety
  • reportbug: fix crash
  • Debian BTS usertags: fix Python, Ruby, QA, porter, archive, release tags
  • Debian wiki pages: TransitionUploadHook

Issues

Review
  • Debian BTS usertags: changes for the month
  • Debian screenshots:

Administration
  • Debian wiki: unblock IP addresses, approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors All work was done on a volunteer basis.

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to to know about. If we look at the "competition" here such as Fedora and Arch Linux, they tend to only have one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such a Fedora's spec file being 3+ domain specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. Which means as a newbie, you can dedicate less mental resources to tracking multiple files and how they interact and more effort understanding the "one file" at hand. I started by asking myself how can we in Debian make the packaging stack more accessible to newcomers? Spoiler alert, I dug myself into rabbit hole and ended up somewhere else than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend more mental effort elsewhere. So I combined debputy's packager provided files detection with a static list of files and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix Now, I could have closed the topic here and said "Look, I did debputy files plus couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission... In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of feature that I would need for this feature, if this feature was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper as it excludes third-party provided tooling. Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it would be possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert it does nothing if a given pkgfile is not present. This lead to the new dh_assistant list-guessed-dh-config-files command that will list all declared pkgfiles and which helpers listed them. With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager provided files and annotate those along with manpage for the relevant debhelper command. The exciting thing about letting debpputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too. Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the wide-spread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/pkg.foo.service. With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not. Turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE and then I could sprinkle that into a few commands. Indeed that worked and meant that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper related configuration files in debian/ and at least associate each of them with one or more commands. Nice, surely, this would be a good place to stop, right...?
Adding more metadata to the files The debhelper detected files only had a command name and manpage URI to that command. It would be nice if we could contextualize this a bit more. Like is this file installed into the package as is like debian/pam or is it a file list to be processed like debian/install. To make this distinction, I could add the most common debhelper file types to my static list and then merge the result together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so added a new plugin feature to provide this kind of detail and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin and moved the debhelper related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files. So, this gave birth file categories and configuration features, which described each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager provided file and so on. I mentioned configuration features above and those were added because, I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. They are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper use filearray, filedoublearray or none of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bit of entropy depending on the context.
  • Will the config file be subject to glob expansions? This topic sounds like a boolean but is a complicated mess. The globs can be handled either by debhelper as it parses the file for you. In this case, the globs are applied to every token. However, this is not what dh_install does. Here the last token on each line is supposed to be a directory and therefore not subject to globs. Therefore, dh_install does the globbing itself afterwards but only on part of the tokens. So that is about 2 bits of entropy more. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice by the original debhelper maintainer took when he introduced the feature in debhelper/8.9.12. Except, dh_install feature interacts with the design choice and does enable glob expansion in the tool output, because it does so manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these difference, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then, I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
Context, context, context However, the dh-executable-config tag among other are only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exist, but then have to read in the extended description that that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using. Then tweak some metadata to enable per compat level rules. With that tags like dh-executable-config only appears for packages using compat 9 or later. Also, debputy should be able to tell you where packager provided files like debian/pam are installed. We already have the logic for packager provided files that debputy supports and I am already using debputy engine for detecting the files. If only the plugin provided metadata gave me the install pattern, debputy would be able tell you where this file goes in the package. Indeed, a bit of tweaking later and setting install-pattern to usr/lib/pam.d/ name , debputy presented me with the correct install-path with the package name placing the name placeholder. Now, I have been using debian/pam as an example, because debian/pam is installed into usr/lib/pam.d in compat 14. But in earlier compat levels, it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the complat level infrastructure and now changing the compat level would change the path. Great. (Bug warning: The value is off-by-one in the current version of debhelper. This is fixed in git) Also, while we are in this install-pattern business, a number of debhelper config files causes files to be installed into a fixed directory. Like debian/docs which causes file to be installed into /usr/share/docs/ package . Surely, we can expand that as well and provide that bit of context too... and done. (Bug warning: The code currently does not account for the main documentation package context) It is rather common pattern for people to do debian/foo.in files, because they want to custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo ". Then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so lets detect those as well as if they were the original file but add a tag saying that they are a generate template and which file we suspect it generates. Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would still appear unfortunately. So off I went to filter the list of known configuration files against which dh_ commands that dh_assistant thought would be used for this package.
Wrapping it up I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata. But it was quite difficult for me to arrange the output in a user friendly manner. However, all this data did seem like it would be useful any tool that wants to understand more about the package. So to get out of the rabbit hole, I for now wrapped all of this into JSON and now we have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can try the following demo: In another day, I will figure out how to structure this output so it is useful for non-machine consumers. Suggestions are welcome. :)
Limitations of the approach As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed those hints, meaning the file will at least be detected. For new introspectable hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

26 January 2024

Bastian Venthur: Investigating popularity of Python build backends over time

Inspired by a Mastodon post by Fran oise Conil, who investigated the current popularity of build backends used in pyproject.toml files, I wanted to investigate how the popularity of build backends used in pyproject.toml files evolved over the years since the introduction of PEP-0517 in 2015. Getting the data Tom Forbes provides a huge dataset that contains information about every file within every release uploaded to PyPI. To get the current dataset, we can use:
curl -L --remote-name-all $(curl -L "https://github.com/pypi-data/data/raw/main/links/dataset.txt")
This will download approximately 30GB of parquet files, providing detailed information about each file included in a PyPI upload, including:
  1. project name, version and release date
  2. file path, size and line count
  3. hash of the file
The dataset does not contain the actual files themselves though, more on that in a moment. Querying the dataset using duckdb We can now use duckdb to query the parquet files directly. Let s look into the schema first:
describe select * from '*.parquet';
 
    column_name     column_type    null    
      varchar         varchar     varchar  
 
  project_name      VARCHAR       YES      
  project_version   VARCHAR       YES      
  project_release   VARCHAR       YES      
  uploaded_on       TIMESTAMP     YES      
  path              VARCHAR       YES      
  archive_path      VARCHAR       YES      
  size              UBIGINT       YES      
  hash              BLOB          YES      
  skip_reason       VARCHAR       YES      
  lines             UBIGINT       YES      
  repository        UINTEGER      YES      
 
  11 rows                       6 columns  
 
From all files mentioned in the dataset, we only care about pyproject.toml files that are in the project s root directory. Since we ll still have to download the actual files, we need to get the path and the repository to construct the corresponding URL to the mirror that contains all files in a bunch of huge git repositories. Some files are not available on the mirrors; to skip these, we only take files where the skip_reason is empty. We also care about the timestamp of the upload (uploaded_on) and the hash to avoid processing identical files twice:
select
    path,
    hash,
    uploaded_on,
    repository
from '*.parquet'
where
    skip_reason == '' and
    lower(string_split(path, '/')[-1]) == 'pyproject.toml' and
    len(string_split(path, '/')) == 5
order by uploaded_on desc
This query runs for a few minutes on my laptop and returns ~1.2M rows. Getting the actual files Using the repository and path, we can now construct an URL from which we can fetch the actual file for further processing:
url = f"https://raw.githubusercontent.com/pypi-data/pypi-mirror- repository /code/ path "
We can download the individual pyproject.toml files and parse them to read the build-backend into a dictionary mapping the file-hash to the build backend. Downloads on GitHub are rate-limited, so downloading 1.2M files will take a couple of days. By skipping files with a hash we ve already processed, we can avoid downloading the same file more than once, cutting the required downloads by circa 50%. Results Assuming the data is complete and my analysis is sound, these are the findings: There is a surprising amount of build backends in use, but the overall amount of uploads per build backend decreases quickly, with a long tail of single uploads:
>>> results.backend.value_counts()
backend
setuptools        701550
poetry            380830
hatchling          56917
flit               36223
pdm                11437
maturin             9796
jupyter             1707
mesonpy              625
scikit               556
                   ...
postry                 1
tree                   1
setuptoos              1
neuron                 1
avalon                 1
maturimaturinn         1
jsonpath               1
ha                     1
pyo3                   1
Name: count, Length: 73, dtype: int64
We pick only the top 4 build backends, and group the remaining ones (including PDM and Maturin) into other so they are accounted for as well. The following plot shows the relative distribution of build backends over time. Each bin represents a time span of 28 days. I chose 28 days to reduce visual clutter. Within each bin, the height of the bars corresponds to the relative proportion of uploads during that time interval: Relative distribution of build backends over time Looking at the right side of the plot, we see the current distribution. It confirms Fran oise s findings about the current popularity of build backends: Between 2018 and 2020 the graph exhibits significant fluctuations, due to the relatively low amount uploads utizing pyproject.toml files. During that early period, Flit started as the most popular build backend, but was eventually displaced by Setuptools and Poetry. Between 2020 and 2020, the overall usage of pyproject.toml files increased significantly. By the end of 2022, the share of Setuptools peaked at 70%. After 2020, other build backends experienced a gradual rise in popularity. Amongh these, Hatch emerged as a notable contender, steadily gaining traction and ultimately stabilizing at 10%. We can also look into the absolute distribution of build backends over time: Absolute distribution of build backends over time The plot shows that Setuptools has the strongest growth trajectory, surpassing all other build backends. Poetry and Hatch are growing at a comparable rate, but since Hatch started roughly 4 years after Poetry, it s lagging behind in popularity. Despite not being among the most widely used backends anymore, Flit maintains a steady and consistent growth pattern, indicating its enduring relevance in the Python packaging landscape. The script for downloading and analyzing the data can be found in my GitHub repository. It contains the results of the duckb query (so you don t have to download the full dataset) and the pickled dictionary, mapping the file hashes to the build backends, saving you days for downloading and analyzing the pyproject.toml files yourself.

Next.

Previous.